Automatic language identification with perceptually guided training and recurrent neural networks
Authors
Abstract
We present a novel approach to automatic language identification (LID). We propose Perceptually Guided Training (PGT), an LID training method that identifies the parts of an utterance that are perceptually most significant for language identification and uses these Perceptually Significant Regions (PSRs) to guide training. The approach uses a recurrent neural network (RNN) as its main mechanism. We argue that RNNs are particularly suitable for LID because long-range acoustic context within an utterance is significant for the task. The approach does not require phonetic labeling or transcription of the training corpus. LIREN/PGT, the LID system we developed, incorporates this approach. Our LID experiments covered English, German, and Mandarin Chinese, using the OGI-TS corpus.
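To make the two ideas in the abstract concrete, the sketch below runs a simple recurrent network over the frames of one utterance and computes a training loss in which frames inside PSRs count more. It is only an illustration under stated assumptions, not the LIREN/PGT system: the abstract does not say how PSRs are found or how they guide training, so the Elman-style network, the per-frame weighting, the `psr_weight` value, the random features, and all function names are hypothetical. The gradient update is omitted.

```python
import math
import random

LANGUAGES = ["English", "German", "Mandarin"]  # languages named in the abstract


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def init_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; the scale is an arbitrary illustrative choice."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def rnn_forward(frames, w_in, w_rec, w_out):
    """Run an Elman-style RNN over acoustic frames (assumed architecture).

    Returns one probability distribution over languages per frame. The hidden
    state carries context from earlier frames of the utterance forward.
    """
    hidden = [0.0] * len(w_rec)
    per_frame_probs = []
    for frame in frames:
        drive = [a + b for a, b in zip(matvec(w_in, frame), matvec(w_rec, hidden))]
        hidden = [math.tanh(d) for d in drive]
        per_frame_probs.append(softmax(matvec(w_out, hidden)))
    return per_frame_probs


def psr_weighted_loss(per_frame_probs, target_index, psr_mask, psr_weight=3.0):
    """Weighted mean cross-entropy; frames flagged as PSR get a larger weight.

    ASSUMPTION: this weighting scheme is hypothetical, chosen only to show one
    way marked regions could steer training.
    """
    weights = [psr_weight if in_psr else 1.0 for in_psr in psr_mask]
    losses = [-math.log(p[target_index]) for p in per_frame_probs]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


def identify(per_frame_probs):
    """Pick the language with the highest mean per-frame probability."""
    n = len(per_frame_probs)
    means = [sum(p[k] for p in per_frame_probs) / n for k in range(len(LANGUAGES))]
    return LANGUAGES[means.index(max(means))], means


# Toy usage with random stand-in features (not real speech data).
rng = random.Random(0)
n_features, n_hidden = 4, 5
w_in = init_matrix(n_hidden, n_features, rng)
w_rec = init_matrix(n_hidden, n_hidden, rng)
w_out = init_matrix(len(LANGUAGES), n_hidden, rng)
frames = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(6)]
psr_mask = [False, True, True, False, False, True]  # hypothetical PSR marks

probs = rnn_forward(frames, w_in, w_rec, w_out)
loss = psr_weighted_loss(probs, LANGUAGES.index("German"), psr_mask)
label, means = identify(probs)
print(f"loss={loss:.4f} predicted={label}")
```

Because the weights are untrained, the predicted label here is arbitrary. The point is only the data flow: per-frame outputs that depend on earlier frames, and a loss that emphasizes the marked regions. Neither step needs phonetic labels, only a language label per utterance and a per-frame PSR mask.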
Similar resources
Knowledge as a Teacher: Knowledge-Guided Structural Attention Networks
Natural language understanding (NLU) is a core component of a spoken dialogue system. Recently, recurrent neural networks (RNNs) have obtained strong results on NLU owing to their superior ability to preserve sequential information over time. Traditionally, the NLU module tags semantic slots for utterances considering their flat structures, as the underlying RNN structure is a linear chain. However, n...
Nonlinear System Identification for Predictive Control using Continuous Time Recurrent Neural Networks and Automatic Differentiation
In this paper, a continuous time recurrent neural network (CTRNN) is developed for use in a nonlinear model predictive control (NMPC) context. The neural network, represented in a general nonlinear state-space form, is used to predict the future dynamic behavior of the nonlinear process in real time. An efficient training algorithm for the proposed network is developed using automatic differentia...
Using word confusion networks for slot filling in spoken language understanding
Semantic slot filling is one of the most challenging problems in spoken language understanding (SLU) because of automatic speech recognition (ASR) errors. One successful approach to improving slot-filling performance is to use a statistical model trained on ASR one-best hypotheses. The state-of-the-art models for slot filling rely on using discriminative sequence modeling methods, s...
End-to-End Language Identification Using Attention-Based Recurrent Neural Networks
This paper proposes a novel attention-based recurrent neural network (RNN) to build an end-to-end automatic language identification (LID) system. Inspired by the success of the attention mechanism on a range of sequence-to-sequence tasks, this work introduces the attention mechanism with a long short-term memory (LSTM) encoder to the sequence-to-tag LID task. This unified architecture extends the end...
Accent Identification by Combining Deep Neural Networks and Recurrent Neural Networks Trained on Long and Short Term Features
Automatic identification of foreign accents is valuable for many speech systems, such as speech recognition, speaker identification, voice conversion, etc. The INTERSPEECH 2016 Native Language Sub-Challenge is to identify the native languages of non-native English speakers from eleven countries. Since differences in accent are due to both prosodic and articulation characteristics, a combination...
Journal title:
Volume, issue:
Pages: -
Publication date: 1998